Article Details

Text Preprocessing and Classification Using Machine Learning Technique | Original Article

Amit Kumar Dewangan, S. M. Ghosh, A. K. Shrivas in Anusandhan (RNTUJ-AN) | Multidisciplinary Academic Research

ABSTRACT:

Sentiment analysis is the task of identifying the sentiment, such as positive or negative, expressed in text data. In this paper we use the feature selection techniques mutual information, information gain, and TF-IDF to select features from a high-dimensional data set. These methods are evaluated on a data set of 2000 user-created movie reviews archived on the IMDb (Internet Movie Database) web portal, known as the “Sentiment Polarity Dataset version 2.0”. The reviews are equally partitioned into a positive set and a negative set (1000 + 1000). Machine learning plays a very important role in the preprocessing and classification of data. Classification is performed using support vector machine (SVM), Random Forest, Random Tree, Naïve Bayes, Bayes Net, and J48 classifiers. We also use an ensemble model provided by the WEKA tool to achieve high accuracy.
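
To make the feature selection step concrete, the following is a minimal sketch in plain Python of two of the scores named in the abstract: TF-IDF weighting and mutual information between term presence and class label. It is not the authors' WEKA workflow. The toy reviews, the whitespace tokenizer, the particular TF-IDF variant (relative term frequency times log(N / df)), and all function names are assumptions made for illustration only.

```python
import math
from collections import Counter

# Hypothetical toy reviews; NOT taken from the Sentiment Polarity Dataset.
DOCS = [
    ("a great and moving film", "pos"),
    ("great acting and a great story", "pos"),
    ("a dull and boring film", "neg"),
    ("boring story and dull acting", "neg"),
]


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, simplistic tokenizer)."""
    return text.lower().split()


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: (term count / document length) * log(N / document frequency).
    """
    tokenized = [tokenize(text) for text, _ in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def mutual_information(docs):
    """Mutual information (in bits) between term presence and class label."""
    tokenized = [(set(tokenize(text)), label) for text, label in docs]
    n = len(tokenized)
    vocab = set().union(*(tokens for tokens, _ in tokenized))
    labels = {label for _, label in tokenized}
    scores = {}
    for term in vocab:
        mi = 0.0
        for present in (True, False):
            n_term = sum(1 for tokens, _ in tokenized if (term in tokens) == present)
            for label in labels:
                n_label = sum(1 for _, lab in tokenized if lab == label)
                n_joint = sum(
                    1 for tokens, lab in tokenized
                    if (term in tokens) == present and lab == label
                )
                if n_joint == 0:
                    continue  # a zero joint probability contributes nothing
                p_joint = n_joint / n
                mi += p_joint * math.log2(p_joint / ((n_term / n) * (n_label / n)))
        scores[term] = mi
    return scores


def top_k_features(scores, k):
    """Pick the k highest-scoring terms, breaking ties alphabetically."""
    return sorted(scores, key=lambda term: (-scores[term], term))[:k]


if __name__ == "__main__":
    mi_scores = mutual_information(DOCS)
    print(top_k_features(mi_scores, 3))
    print(tf_idf(DOCS)[1])
```

On this toy corpus, words that appear only in one class score highest under mutual information, while a word that appears in every review, such as "and", receives a TF-IDF weight of zero and a mutual information of zero. On a real corpus the selected terms would then be passed to a classifier such as those listed above.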